Hierarchical Question-Image Co-Attention for Visual Question Answering
Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh

A number of recent works have proposed attention models for Visual Question Answering (VQA) that generate spatial maps highlighting image regions relevant to answering the question. In this paper, we argue that in addition to modeling "where to look" or visual attention, it is equally important to model "what words to listen to" or question attention. We present a novel co-attention model for VQA that jointly reasons about image and question attention. In addition, our model reasons about the question (and consequently the image via the co-attention mechanism) in a hierarchical fashion via a novel 1-dimensional convolutional neural network (CNN). Our model improves the state-of-the-art on the VQA dataset from 60.3% to 60.5%, and from 61.6% to 63.3% on the COCO-QA dataset. Using ResNet features, performance is further improved to 62.1% for VQA and 65.4% for COCO-QA.
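To make the co-attention idea concrete, below is a minimal numpy sketch of one parallel co-attention step in the spirit of the paper: an affinity matrix between question tokens and image regions conditions each modality's attention on the other. The weight names (`W_b`, `W_v`, `W_q`), dimensions, and random initialization are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def parallel_coattention(V, Q, k=64, rng=np.random.default_rng(0)):
    """One parallel co-attention step (illustrative sketch).

    V: (d, N) image features for N spatial regions
    Q: (d, T) question features for T tokens
    Returns an attended image vector (d,) and question vector (d,).
    """
    d, _ = V.shape
    # Illustrative random weights; in a real model these are learned.
    W_b = rng.standard_normal((d, d)) * 0.01
    W_v = rng.standard_normal((k, d)) * 0.01
    W_q = rng.standard_normal((k, d)) * 0.01
    w_hv = rng.standard_normal(k) * 0.01
    w_hq = rng.standard_normal(k) * 0.01

    # Affinity between every (token, region) pair: (T, N)
    C = np.tanh(Q.T @ W_b @ V)
    # Condition each modality's hidden representation on the other via C.
    H_v = np.tanh(W_v @ V + (W_q @ Q) @ C)    # (k, N)
    H_q = np.tanh(W_q @ Q + (W_v @ V) @ C.T)  # (k, T)
    # Attention distributions over image regions and question tokens.
    a_v = softmax(w_hv @ H_v)                 # (N,)
    a_q = softmax(w_hq @ H_q)                 # (T,)
    # Attention-weighted summaries of each modality.
    return V @ a_v, Q @ a_q

# Toy usage: 196 image regions, 8 question tokens, 512-d features.
v_hat, q_hat = parallel_coattention(
    np.random.rand(512, 196), np.random.rand(512, 8))
print(v_hat.shape, q_hat.shape)  # (512,) (512,)
```

Since the abstract describes hierarchical reasoning over the question via 1-D convolutions, a step like this would presumably be applied at each level of the question hierarchy (word, phrase, question), with the attended vectors combined for answer prediction.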
Reviews: Hierarchical Question-Image Co-Attention for Visual Question Answering
The paper presents an incremental contribution with respect to previous VQA methods, which exploit only an image attention mechanism guided by the question. Here, the authors also consider a question attention mechanism guided by image information. The main hypothesis of this work is thus that jointly modeling visual and question attention can improve the performance of current VQA systems. I agree that this hypothesis is plausible for long questions, but I see a risk that question attention guided by image information can be misleading, in the sense that an image usually contains several sources of information while the question is more focused. In Figure 3, the authors include a graph showing the impact of question length on performance; while the figure suggests a trend, the effect is still weak, and a numerical analysis would help support this point. An analysis of the differences (beyond question length) between the most common errors of previous image-attention-only methods and those of the proposed image-plus-question-attention approach would further support the relevance of the proposed attention mechanism.
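To make the numerical analysis the review asks for concrete, here is a minimal sketch that buckets accuracy by question length; the record format (question, prediction, ground truth) is a hypothetical assumption for illustration, not part of the paper's evaluation code.

```python
from collections import defaultdict

def accuracy_by_question_length(records):
    """Bucket VQA accuracy by question length in tokens.

    records: iterable of (question, predicted_answer, ground_truth) triples.
    Returns {length: accuracy}, putting numbers behind the kind of
    length-versus-performance trend shown in Figure 3.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for question, pred, truth in records:
        n = len(question.split())
        totals[n] += 1
        hits[n] += int(pred == truth)
    return {n: hits[n] / totals[n] for n in sorted(totals)}

# Toy usage with hypothetical records:
records = [
    ("what color is the cat", "black", "black"),
    ("how many people are sitting on the long wooden bench", "3", "2"),
]
print(accuracy_by_question_length(records))  # {5: 1.0, 10: 0.0}
```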